Knowledge-based mechanistic modeling accurately predicts disease progression with gefitinib in EGFR-mutant lung adenocarcinoma

Lung adenocarcinoma (LUAD) is associated with a low survival rate at advanced stages. Although the development of targeted therapies has improved outcomes in LUAD patients with identified and specific genetic alterations, such as activating mutations on the epidermal growth factor receptor gene (EGFR), the emergence of tumor resistance eventually occurs in all patients and this is driving the development of new therapies. In this paper, we present the In Silico EGFR-mutant LUAD (ISELA) model that links LUAD patients’ individual characteristics, including tumor genetic heterogeneity, to tumor size evolution and tumor progression over time under first generation EGFR tyrosine kinase inhibitor gefitinib. This translational mechanistic model gathers extensive knowledge on LUAD and was calibrated on multiple scales, including in vitro, human tumor xenograft mouse and human, reproducing more than 90% of the experimental data identified. Moreover, with 98.5% coverage and 99.4% negative logrank tests, the model accurately reproduced the time to progression from the Lux-Lung 7 clinical trial, which was unused in calibration, thus supporting the model high predictive value. This knowledge-based mechanistic model could be a valuable tool in the development of new therapies targeting EGFR-mutant LUAD as a foundation for the generation of synthetic control arms.


Supplementary material A -ISELA model documentation
The ISELA model is a knowledge-based mechanistic model designed to reproduce tumor size evolution and disease progression of virtual patients matching real-world patients with EGFR-mutant LUAD treated with gefitinib. The ISELA model is made of six submodels: (i) Tumor Growth, (ii) Tumor mutational profile, (iii) EGFR signaling pathway, (iv) Tumor heterogeneity, (v) Clinical submodel, (vi) Gefitinib treatment. For all these submodels, a detailed documentation was extracted from jinkō.ai platform and presented below. The implementation of these models is also provided in SBML (L3V2), XLSX and ANT format. These files were generated using the libAntimony tool (v2.13.2) (Medley et al, 2018;Choi et al, 2018).

A.1 Tumor Growth
A solid tumor can be seen as a group of tumor clones, where a tumor clone is a group of cells that harbor the same phenotype due to specific clonal mutations (see section A.2.1 for more details). Growth of the tumor clones is modeled with the Tumor Growth submodel, that focuses mainly on the growth and proliferation of tumor cells by describing three cell subpopulations inside the clone: proliferating cells, quiescent cells and dead cells.
A few hypotheses lie behind this modeling choice. First, we hypothesize that the supply of resources needed by the cells such as nutrients is perfectly correlated with oxygen supply: this assumption allow us to consider oxygen as a proxy of nutrients availability. Then, we also assume that the oxygen profile from the outside to the inside of the tumor is continuously decreasing, from physiological values at tumor edge. Then, by fixing two threshold values for oxygen concentration (such that both proliferating state and alive state are determined by two distinct oxygen concentration thresholds), we are able to consider two spatial depths thresholds in order to create three subpopulations implicitly taking nutrients into account.
Each tumor clone will therefore have its own set of proliferating and quiescent cells; but as the distinction is not relevant, a unique pool of dead cells is modeled, regardless of their original clone, as illustrated in the Supplementary Figure A.1, representing the cellular dynamics of a given clone.

A.1.1 Tumor geometry and model structure
We assume that the tumor is a perfect sphere with a central symmetry. The state of a tumor radius is thus representative of the entire tumor. The tumor volume can then be easily derived from the tumor radius. We postulate that an unknown oxygen concentration profile (or gradient) exists across that tumor radius, which may dynamically change with tumor size evolution. This profile is governed by diffusion and vasculature (transport) properties. One assume that this profile is continuous and monotonically decreasing from the outside of the tumor where oxygen concentration is at physiological levels, to the center.
We also assume that all vital resource supply such as nutrients for the cells are perfectly correlated with oxygen supply, i.e. nutrients are always available to same extent as oxygen concentration. Under this hypothesis, one can use the oxygen only as a proxy for all resources and consider it as the single limiting resource.
We then defined a cell type C i as the set of cells that benefits from an oxygen concentration in their direct environment between values o i−1 and o i+1 . By construction, if o 0 is the minimal oxygen concentration in the tumor and o N +1 the maximal oxygen concentration in the tumor, then all tumor cells belong to one of the N cell types defined in that way.
Based on this definition and our previous hypothesis on the oxygen profile, cell types are naturally ordered on the axis defined by the tumor radius and form distinct populations along it. Given this unidimensional geometry, each cell from a given type C i population can only interact with its neighbors (except for the border cell type populations C 0 and C N ) C i−1 and C i+1 .
We also hypothesize that the cell density within each cell type C i is equal.
We also assume that the non-tumor cells are negligible and are therefore not taken into account for the computation of the size of the tumor.
We then make the hypothesis of the existence of the two following oxygen concentration thresholds: • Threshold T 1 under which tumor cells do not have enough oxygen to live so that die by necrosis. • Threshold T 2 under which tumor cells do not have enough oxygen to proliferate so that they pause the cell cycle.
This hypothesis implies that these three proliferating, quiescent (living but not proliferating) and dead states depends on the oxygen concentration and that unique and well-defined thresholds exist to clearly segregate any cells between these three states. This also implies that the transition between these states is instantaneous without any intermediate or transient state.
As these thresholds are shared by all cells, independently of their state or origin, they can be considered as an extrinsic biological property.
As proliferating cells are alive, we deduced that T 1 ≤ T 2 since proliferating cells are alive.
Based on these two thresholds, we defined N = 3 cell types, using o 1 = T 1 and o 2 = T 2 , that have the following characteristics: • C 1 cells are dead and cannot proliferate. They are called "dead" cells.
• C 2 cells are alive but can not proliferate. They are called "quiescent" cells.
• C 3 cells are alive and can proliferate. They are called "proliferating" cells.
As the oxygen concentration profile is assumed to be monotonically decreasing from the spherical tumor surface to its center, there can be determined equivalent depth thresholds characterizing at which depth within the tumor these oxygen concentration thresholds are reached. Then, assuming the tumor density is constant, this is equivalent to define critical cell numbers above which the tumor radius will be greater than these depth thresholds. We thus define: • The length l 1 such that all cells located at a lower depth from the tumor surface receive oxygen in levels greater than T 1 threshold level, and are thus proliferating ; or equivalently, the maximal number of cells N 1 that can be located within l 1 from the tumor surface • The length l 2 such that all cells located at a lower depth from the tumor surface receive oxygen in levels greater than T 2 threshold level, and are thus alive ; or equivalently, the maximal number of cells N 2 that can be located within l 2 from the tumor surface.
It should be noted that N 1 and N 2 necessarily depend on the tumor radius, assuming constant cell density, since the larger the tumor the more A.1 Tumor Growth v cells can fit in a l 1 or l 2 layer under the surface.
We chose not to model explicitly the spatial oxygen profile within the tumor by parameterizing the submodel in terms of spatial depth (or the equivalent view in terms of critical cell numbers) that implicitly takes into account the oxygen profile properties. These threshold parameters also implicitly take into account the mechanisms linked with tumor vasculature. Some specific processes such as neoangiogenesis are explicitly taken into account by modulating these parameters (see section A.1.4).
The submodel structure thus consists in a spherical tumor composed of a dead cell center, surrounded by a middle layer of quiescent cells, themselves surrounded by an external layer of proliferating cells. The model main variables are the number of cells within these three population which are dynamically described by considering the interactions between the cell types, the transition of cells from one type to another one, and the interactions of proliferating cells with their external environment. These different interactions or processes that are considered in the submodel are described in the following subsections.

A.1.2 Cell proliferation
Tumor Growth results from cell proliferation. The tumor is divided into two types of cells: dead cells modeled with the variable DeadTumoralCells and living cells with the variable LivingTumoralCells. As described with submodel structure, living cells may proliferate only if they are close enough to the tumor such that they receive enough oxygen. This maximal number of cells N 1 that can proliferate depends on tumor vasculature and is denoted with the parameter maxProlifCellsWithAngio when neoangiogenesis mechanisms are involved and with the parameter maxProlifCellsWithDiffusion otherwise. When the number of living cells is lower than this critical threshold, all living cells may proliferate, otherwise, only as many as cells as this critical number undergo proliferation.
We assumed that the transition between these quiescent and proliferating state is much faster in front of the other time scales in our model, that is can be modeled as instantaneous. The short time scales needed to describe cell cycle are not relevant to describe overall tumor size evolution.
Proliferation is modeled with saturated first-order kinetics, such that a single parameter characterizes the proliferation rate. The saturation was introduced to prevent the entire tumor from growing above a critical size called carrying capacity. This carrying capacity is set at the lung volume such that tumor size can not reach biologically unrealistic levels.
The proliferation rate is also modulated with a Hill function that depends on the EGFR signaling pathway output proliferation signal, parameter cellProlifEqProliferating in our model (see section A.4), in order to bridge intracellular signaling to cell behavior.

A.1.3 Cell death by necrosis
Quiescent cells that do not receive enough oxygen die by necrosis and their debris contribute to the formation of the necrotic dead core at the center of the tumor.
In a similar fashion proliferating cells were defined, cells are considered to receive enough oxygen to live as long as they remain above a critical cell number N 2 . This thresholds parameter depends on tumor vasculature and is denoted with the parameter maxLivingCellsWithAngio when neoangiogenesis mechanisms are involved and with the parameter maxLivingCellsWithDiffusion otherwise.
When the number of cells is greater than this critical threshold, cells start to die by necrosis. In this case, the death rate is assumed to be proportional to the difference between the total cell number and this critical threshold. When cells die, the tumor necrotic core increases by as much volume than previously occupied by these cells.

A.1.4 Neoangiogenesis
Oxygen delivery is reduced in the inner regions of the tumor, and the distance on which oxygen has to diffuse increases with tumor growth, reducing oxygen availability for tumor inner cells. Neoangiogenesis is a mechanism promoted by the tumor to overcome this oxygen shortage by inducing vasculature development in the tumor.
As previously defined, oxygen diffusion is accounted for in our model by two critical thresholds ruling if cells receive enough oxygen to either live and proliferate. In the absence of neoangiogenesis processes, we assumed that the maximal depth at which cells receive enough oxygen to survive is about 104 µm in vitro. With the constant cell density assumption, this allowed to derived a value for parameter maxLivingCellsWithDiffusion. We then made the assumption that the maximal number of cells that can proliferate, the parameter maxProliferatingCellsWithDiffusion, is a constant fraction of this former threshold.
We modeled neoangiogenesis as an additional mechanism that increases these two critical cell numbers, as neoangiogenesis allows oxygen to be available to more cells or in higher levels. The parameter accounting for the maximal number of living cells in the tumor maxLivingCellsWithNeoangiogenesis is thus assumed to be equal to maxLivingCellsWithDiffusion plus a parameter extraLivingCellsWithAngio that is much higher than the former one as neoangiogenesis-induced vasculature is assumed to be a A.2 Tumor heterogeneity vii more efficient mechanism than diffusion for delivering oxygen inside the tumor. extraLivingCellsWithAngio is constant in one patient, but can vary from one patient to another. Similarly as the modeling without neoangiogenesis, we assumed that the maximal number of cells that can proliferate taking int account neoangiogenesis, the parameter maxProliferatingCellsWithNeoangiogenesis, is a constant fraction of maxLivingCellsWithNeoangiogenesis.

A.1.5 Immune system
The ISELA model focuses on late cancer stages, hence when the immune system -tumor setting is likely to be close to an equilibrium. We thus decided set the number of immune system cells constant in the tumor micro-environment.

A.2 Tumor heterogeneity
This submodel aims at accounting for the tumor heterogeneity that is characterized by the diversity of mutations, genotypic alterations and phenotypic modifications that the restricted set of driver mutations explicitly modeled can not represent (see section A.3). The tumor heterogeneity submodel thus adds more complexity to the Tumor Growth submodel in order to account for this heterogeneity observed in real tumors.
The rationale to add more complexity to the Tumor Growth submodel is that tumors may be composed of different regions that growth and respond to treatment in different extent. In particular, if all cells in the tumor were homogeneous in our model, if a treatment-induced resistance alteration appears (see section A.2.2), all cells would benefit from it, and the tumor would instantaneously relapse, which was deemed tautological and unrealistic. We thus estimated that accounting for intratumor heterogeneity was necessary in order to represent more realistically the tumor composition and behavior. Then, if a resistance alteration would appear in a fraction of cells only, these cells would earn a proliferative advantage that would lead them to expand competitively, which is closer to the expected biological behavior.

A.2.1 Tumor clonality
We thus decided to model a population of clones in the tumor, each clone being characterized by a different mutational signature compared to the other ones. In our model, the tumor cells are split in as many populations as the number of clones coexisting in the tumor.
Each of these populations is ruled by the equations of the Tumor Growth submodel and EGFR pathway submodel, which are thus duplicated for each clone. Each clone is assumed to be an independent sub-tumor, driven by these equations, not interacting directly with the other clones.

A -ISELA MODEL DOCUMENTATION
To account for clone interactions and integration of clones within an overall tumor, the following changes are made: • The dead cell population was not duplicated for all clones, as these consists in necrotic debris that stand passively in the model, and their mutational profile is thus not impactful. The DeadTumoralCells variable is thus not duplicated and all quiescent cell dying are integrated to this common compartment independently of their properties. • We assumed that the heterogeneous tumor keeps its perfectly separated dead and living layers, and that each cell type in each clone occupies an angular sector or fraction of the entire tumor type layer. • When the tumor is made of more than one clone, transition rates between one cell type to the other are scaled not with the cell type surface, but with the portion of the cell type surface visible by the clone cell type. This thus add a new ratio to the rates equal to the number of cells of the clone of this cell type over the total number of cells of this cell type in this tumor. This correction is necessary to preserve the mass balance in this new model structure.
In addition to the number of clones for which we know the distribution from literature -thus a maximal number of 15 is created in the model -another clone is created, named the clone 0, in order to represent the clone that may acquire a resistance mutation and expand upon treatment administration.
To represent the heterogeneity between clones in our model, we identified parameters to account for the impact of unmodeled mutations on cancer hallmark mechanisms. The only exception is for the neoangiogenesis hallmark for which there no parameter defined differently between clones. These parameters may have different values between clones within the same tumor, and are listed below: • deltaCellProlifEq parameter, which accounts for the impact of mutations on "proliferation" hallmark, as this parameter corresponds to the threshold of the Hill function used to link EGFR proliferation signaling to tumor proliferation rate. • deltaCellDeathInhibEq parameter, which accounts for the impact of mutations on "cell death inhibition" hallmark, as this parameter corresponds to the threshold of the Hill function used to link EGFR death inhibition signaling to tumor death rate. • maxCellDeathInhibEq parameter, which accounts for the impact of mutations on the "immune escape hallmark" as this parameter represents the maximal rate at which tumor cells may be killed by the immune system.
At the beginning of the simulation, each clone is assumed to share the same volume percentage of the initial tumor, i.e. they all have the same size.

A.2.2 Treatment-induced resistance-conferring alteration apparition
After treatment initiation, a resistance-conferring alteration may expand in the tumor. These alterations are named treatment-induced alterations, as opposed to driver alterations, in order to easily distinguish them, although both can confer a proliferative or survival advantage to the tumor. The naming difference refers to the time of apparition of these alterations: driver alterations exist prior treatment application, whereas treatment-induced alterations appear afterwards.
We selected the following treatment-induced alterations to account for in our model: It is a knowledge gap whether treatment-induced alterations appear due to selection of cells that initially possess the alteration and thus gained a competitive advantage, or whether it appears by acquisition of that alteration due to selection pressure. We tested several hypothesis and favored the latter "selection" hypothesis. At the beginning of the simulation, a subclone is created from a parent clone to share the same characteristics except that this subclone already possesses the resistance mutation. This subclone remains neutral but might be selected upon treatment administration. We also assume that this clone is randomly chosen between all the pre-existing clones with equal probabilities. The initial size of this subclone (i.e. the number of cells possessing this mutation) is called resistantSubCloneInitialSize. This resistant subclone is defined as the clone 0.
We thus added the following descriptors the Vpop, in addition to the descriptors listed in the Tumor heterogeneity subsection: • The type of alteration induced by the treatment (see list above). This is described by the treatmentInducedAlteration parameter. However, we considered that patients can not have a treatment-induced alteration they already have as driver alteration. • The number of cells initially possessing the resistance mutation. this is described by the resistantSubCloneInitialSize parameter, which was calibrated during the model building. • The number of the parent clone from which the resistance subclone originates. This is described by the resistantSubcloneParentNumber parameter.
We made the hypothesis that EGFR T790M mutation can not appear in a patient already bearing an exon 20 insertion since both mutations are located on exon 20 and we found no co-occurence evidence in the literature. We also assume that, when two mutation resistance appears, they appear on the same clone.

A.3 Tumor mutational profile
This submodel aims at describing the variability shown in the real population to account for in the virtual population used for simulations on the integrated model. The description of each virtual patient lung tumor notably includes genetic alterations, clone number and clone sizes. These characteristics should be representative, in terms of distributions and correlations with patient descriptors, of what is observed in the real population.
We decided to include in our model only the driver alterations for which the protein was explicitely modeled in our EGFR signaling pathway submodel and for which alteration incidence was available. Thus, the following alterations are modeled: We made the hypothesis to consider that all of these alterations are actually mutations and not amplifications or other genetic abnormalities. Furthermore, we made the hypothesis that all of these mutations are clonal (i.e. present in all clones).
The list of the main patient descriptors included in the virtual population is the following: • Driver gene mutations (see list above), which incidences and associated correlations were estimated from literature. This is described by the isXmutated parameters for all mutations X. • Type of EGFR specific mutation (see list above) which incidence was estimated from the literature and associated correlations from CLCGP data. This is described by the mutEGFR parameter. • Number of clones, which distribution was estimated from Hanjani's study, by assuming that clones represent mutational clusters. A discrete lognormal distribution was manually fitted to account for a mean number of clones of 5 and values ranging between 2 and 15. This is described by the nClones parameter. • Age, which distribution and associated correlation were estimated from CLCGP data. This is described by the age parameter.
A.3 Tumor mutational profile xi • Ethnicity, which distribution and associated correlations were estimated from SEER resumed data. This is described by the ethnicity parameter. • Sex, which distribution and associated correlations were estimated from CLCGP data. This is described by the isMale parameter. • Tumor extension parameter, which distribution was estimated from SEER resumed data. This is described by the extTumor parameter. • Smoking status, which incidence was estimated from CLCGP data. This is described by the smokingStatus parameter. • Tumor initial size, which distribution was estimated from CLCGP data. This is described by the initialTNM parameter. • N category, which distribution was estimated from CLCGP data (see section A.5 for more details). This is described by the initialTNM parameter. • M category, which distribution was estimated from CLCGP data (see section A.5 for more details). This is described by the initialTNM parameter.

A.3.1 Impact of driver alterations
Driver alterations confer a selective advantage to the tumor cells and clones by impacting one or several tumorigenesis-related mechanisms.
The effect of each selected mutation is modeled as explained in the following: • EGFR: 3 mutations are taken into account for EGFR: exon 19 deletion, exon 20 insertion, and exon 21 L858R point mutations. These mutations are implemented as having three different impacts: -EGFR mutations lead to constitutive activation of EGFR, therefore the downstream pathways of EGFR are activated as well. This is verified for exon 19 and 21 mutations and it is hypothesized in the literature that exon 20 mutations lead to the same effects. -EGFR mutations modify the affinity of gefitinib towards ATP (see treatmentInducedAlteration in section A.6). -EGFR mutations modify the inhibition constant of gefitinib (see K I in section A.6).
• PIK3CA: PIK3CA mutation effect is implemented as an increase in PIK3CA kinase activity, estimated from in vitro data. • BRAF: BRAF mutation has been reported to cause an increase in the kinase activity of MEK-phosphorylating RAF, resulting in an estimated 10-fold increase of activation, as estimated from the two most common mutations V599E and K600E. This mutation effect was implemented with such increase on the MEK activation rate in our model. • KRAS: KRAS mutations have been reported to impair 97-99% of the GTPase activity of RAS, keeping it into a constitutively active form. As such, this mutation effect has been implemented with such reduction impact on the RAS deactivation rate.
• MET: MET Exon 14 deletion impacts the protein internalization by decreasing its degradation rate. As such, MET mutation effect was modeled as a decrease in the MET internalization and degradation rate. In addition, as the mean MET gene copy number associated with this mutation is 3.7 in stage II to IV patients, a 3.7 multiplicative factor was implemented on the total MET concentration upon MET mutation.

A.4 EGFR signaling pathway
The objective of this submodel is to represent mechanistically EGFR (epidermal growth factor (EGF) receptor) -one receptor tyrosine kinase, RTKsignaling pathway at the molecular level, as the main pathway driving cell proliferation and survival in cancer. Based on the mutation present, and the treatment received, it will provide a cell proliferation signal and cell death inhibition signal to the Tumor Growth submodel. It also includes MET-related signaling pathways, as MET plays a critical role in resistance to EGFR inhibitor.
The submodel implements signal transduction from EGFR (EGFdependent and independent signaling) and activation of downstream mitogenactivated protein kinase (MAPK) and phosphatidylinositol 3-kinase/protein kinase B (PI3K/AKT) pathways. ERK (extracellular signal regulated kinase) and AKT (protein kinase B) are considered as surrogate of MAPK / PI3K/AKT pathways activation status respectively.

A.4.1 Receptor tyrosine kinase activation
This includes two parallel reactions: • EGFR activation by EGF • MET activation by HGF Each RTK activation is implemented as a bidirectional reaction, following it own mass constant rate (see below), dependent, for the forward reaction, of the ligand concentration. Activation constant of MET was assumed to be of the same order of magnitude as EGFR. • RAS activation by EGFR and MET • RAF activation by RAS • MEK activation by RAF • ERK activation by MEK Each protein activation is implemented as a bidirectional reaction, following it own mass constant rate, dependent, for the forward reaction, of the concentration of the activated protein of the previous step (e.g. the forward rate of activation of RAF will depend of activated RAS concentration -see EGFR activation).

A.4.2 MAPK pathway activation
The cell proliferation signal resulting from this pathway is implemented as the ratio of the concentration of activated ERK over total ERK concentration.

A.4.3 PI3K/AKT pathway activation
This includes three successive reactions: • PI3K activation by EGFR, MET and RAS • PIP2 phosphorylation into PIP3 by PI3K • AKT activation by PIP3 Each protein/compound activation is implemented as a bidirectional reaction, following it own mass constant rate, dependent, for the forward reaction, of the concentration of the activated protein/compound of the previous step (e.g. the forward rate of AKT activation will depend of activated PIP3 concentration -see EGFR activation). The dephosphorylation rate of PIP3 includes two components, one corresponding to PTEN activity (and proportional to PTEN concentration), one corresponding to PIP2/PIP3 concentration homeostasis (based on our hypothesis of constant concentration of all compounds).
The cell death inhibition signal resulting from this pathway is implemented as the ratio of the concentration of activated AKT over total AKT concentration.

A.5 Clinical submodel
The clinical submodel aims at:

A.5.1 Tumor stages -TNM classification
The clinical stage of each patient is determined through a descriptor initialTNM whose distribution was estimated from literature. Associated parameters encoding for the N category (nodalStatus parameter) and the M category (metastasisStatus parameter) are directly deduced from this former descriptor. These last two categories (N and M) are mainly used in the model when generating virtual population, to take into account the correlation with mutational burden.
The initial T category of an in silico patient impacts its initial tumor size. Each T category is defined by a range of tumor diameter. We arbitrarily set the maximal initial tumor size in the T4 category at 10 cm, and assumed that tumor size distribution within each category was uniform. The uniform distribution of tumor diameter for each stage in our model are then the following ones: • Between 0 and 3 cm (diameter) for T1 • Between 3 and 5 cm for T2 • Between 5 and 7 cm for T3 • Between 7 and 10 cm for T4 Note that for initial T categories, subcategories (e.g. T1a, T1b) were not implemented as their respective abundance was not available in the reference data used to build the model.

A.5 Clinical submodel
xv Throughout the simulation, the T category of an in silico patient tumor may change depending on the evolution of the tumor size. The T category, implemented with tumorCategory parameter, can then be computed dynamically based on the value of tumorRadius : • For tumorRadius = 0, T category is T0.
During the simulation, the clinical stage of an in silico patient, implemented as clinicalStage parameter, is then continuously inferred from tumor T, N and M category parameters introduced above.

A.5.2 Time to progression (TTP)
We chose to implement the time to progression (TTP) as the clinical endpoint or our model. It is defined as the time elapsed between treatment initiation and tumor progression. TTP is increasingly more often used in clinical trial designs because of feasibility issues (smaller sample sizes and shorter followup).
Based on this definition, TTP was easily implemented in our model as the time elapsed between treatment initiation and the time at which the tumor growth rate turns positive again. It should be noted that, because tumor progression is only assessed during follow-up visits in real life setting, the simulated TTP is expected to be lower than the observed TTP.
Three estimates of TTP are modeled to take into account various factors impacting TTP measurement in clinical practice: • Absolute TTP: The first TTP estimate is named the "absolute" time to progression and follows the exact definition of TTP. It thus consists in the time difference between the time of treatment initiation and the first time at which the derivative of the overall tumor cell number becomes positive. It is implemented as the absoluteTimeToProgression parameter. • Measurable TTP: The second TTP estimate is named the "measurable" time to progression, and takes into account the measure uncertainty of imaging techniques. Indeed, typical imagery techniques such as computed tomography (CT) scan have a spatial resolution that does not allow to determine if the tumor size has increased if the increase is smaller than image resolution. We considered a typical resolution of 1 mm from CT recommendations implemented with radialCTresolution. parameter. The measureable TTP is thus defined as the time elapsed between treatment initiation and the time when the tumor radius has increased by at least the the spatial resolution size. It is implemented as the measureableTimeToProgression parameter. • Clinical TTP: The third TTP estimate is named the "clinical" time to progression, and takes into account the clinical definition of progression as defined in RECIST 1.1 guidelines. The clinical TTP is thus defined as the time elapsed between treatment initiation and the the time when the tumor radius has increased by at least 20% and at least 2.5 mm, both from the minimal size reached after treatment initiation. It is implemented as the timeToClinicalProgression parameter.
Based on these three definitions, timeToClinicalProgression should most closely reproduce the TTP reported in clinical studies. TTP will thus be calibrated on timeToClinicalProgression. The absoluteTimeToProgression and the measureableTimeToProgression will not be used in the calibration process but are interesting for analysis purposes as they bear information not easily measurable in clinical practice.

A.6 Gefitinib treatment
This submodel aims is to represent pharmacokinetic and pharmacodynamic features of the first-generation EGFR-TKI drug gefitinib.
Pharmacokinetics describes how the body affects drug exposure through absorption, distribution, metabolism and excretion of the drug. Drug exposure is typically monitored in plasma, but sometimes also in the tissue where the compound exerts its pharmacodynamic effect. Pharmacodynamics describes the pharmacologic effect of the administered drug.

A.6.1 Steady-state plasma exposure profile
We model the evolution in time of the total plasma concentration of gefitinib following daily oral administration with a 'fit-for-purpose' one-compartment linear pharmacokinetic model. Linear modeling, often associated with a one-compartment representation, is supported by preclinical studies, phase I studies and population PK models of gefitinib and is implemented here as first choice in order to reproduce plasma exposure of the compounds. The top-down approach represented by a non-physiologically-based pharmacokinetic model is the simplest way to reproduce plasma PK properties and the associated inter-patient variability, despite the fact that it does not allow a mechanistic representation of all the phenomena that influence plasma concentration. Nevertheless, the impact of phenomena related to the ADME (i.e. administration, distribution, metabolism and excretion) processes can still be taken into account by adjusting the kinetic parameters of the PK model.
The steady-state plasma exposure profile of the parent drug is determined by the equilibration between the rates of two reactions, one representing drug Supplementary Figure A3 Schematic representation of the submodel structure. The goal is to reproduce plasma and tumor tissue drug concentrations (respectively Cp and Ce) as a function of time. A/ Cp is described by a one-compartment PK model, where the bioavailable fraction (F) of the administered drug is absorbed with an absorption rate kab. The drug is eliminated from plasma at an elimination rate ke. B/ Cp is input into the tumor tissue compartment, where the drug concentration Ce varies in time according to the transfer rates k1e and k0e respectively from and to the plasma compartment. The inhibition of EGFR in the tumor tissue depends on Ce. Cp can be modeled by solving a system of ordinary differential equations (ODEs) (as done in A/) but it is computationally expensive. Instead, an analytical function (called the forcing function, f(t)) that uses the same parameters (F, kab and ke) is adopted to reproduce Cp at a low computational cost and is fitted to observed plasma concentration-time data. The forcing function is input into the tumor tissue compartment. Note that the forcing function is only a function of time (f(t)) and that Ce does not affect Cp under the assumption that the tumor tissue compartment is sufficiently small. absorption and the other one representing drug elimination. Both are reactions with first-order kinetics determining respectively absorption of the parent drug after oral administration to the plasma compartment and disappearance from it, so that: where: • C p is the total (bound + unbound) parent drug concentration in the plasma compartment • A p is the total amount of parent drug in the plasma compartment • A d is the total amount of administered drug • F is the bioavailability • k ab is the first-order absorption rate constant • k e is the first-order elimination rate constant • Cl/F is the clearance • Vd/F is the apparent volume of distribution and parameters k ab , Cl/F and Vd/F scale with body weight according to the formula: Additionally, the allometrically scaled parameters Cl/F and V d/F both vary with age. Cl/F also varies with serum α1-acid glycoprotein concentration (mg/dL) and concomitant use of CYP3A4 inducers. The mathematical formulation of the dependency of Cl/F and V d/F from the listed covariates is described in a published population pharmacokinetic model of oral gefitinib in NSCLC patients.

A.6.2 Drug exposure in tumor tissue
The tumor tissue compartment is the compartment where the drug exerts its pharmacological effect. The steady-state total tumor tissue concentration of the parent drug is determined by the equilibration between the rates of two reactions, one representing drug entering the tumor tissue compartment from the plasma compartment and the other one representing drug elimination from the tumor tissue compartment. Both are reactions with first-order kinetics, so that: • C p is the total (bound + unbound) parent drug concentration in the plasma compartment • Ce is the total (bound + unbound) parent drug concentration in the tumor tissue compartment • k 1e is the first-order rate constant for transferring the drug from the plasma compartment to the tumor tissue compartment • k 0e is the first-order rate constant for drug elimination from the tumor tissue compartment

A.6.3 EGFR inhibition in tumor tissue
Gefitinib is a reversible inhibitor of EGFR by competing with ATP at the intracellular kinase domain, inhibiting its autophosphorylation and, thus, attenuating the activation of intracellular signaling cascades.
We model the competitive inhibition of EGFR autophosphorylation in the tumor tissue following classical Michaelis-Menten equations for competitive inhibition: where: • EGFR is the activated (non phosphoriylated) EGFR isoform • ATP is the ATP • K M is the ATP Michaelis-Menten constant relative to a given EGFR isoform • Ce is the total (bound + unbound) parent drug concentration in the tumor tissue compartment • K I is the inhibition constant • fu is the unbound fraction of the parent drug in the tumor tissue

Derivation of unpublished inhibitory constant values
We model the following EGFR isoforms: • mutant A763 Y764insFQEA (exon 20 insertion) • mutant harboring a generic resistant exon 20 insertion • double mutant L858R/T790M • double mutant Del19/T790M K i , the inhibition (or inhibitory) constant represents the affinity of the inhibitor for the enzyme (the lower the value, the higher the affinity). K i values as experimentally estimated by biochemical assays are not available for all the aforementioned mutants. In absence of an experimental value, we derive it from relevant IC 50 data about EGFR autophosphorylation inhibition, according to the Cheng-Prusoff relationship:

Supplementary material B -References
A thorough review of more than 250 scientific papers was performed to develop the ISELA model. The complete list is provided below.   (11)